Comparing Verb Synonym Resources for Portuguese
Authors
Abstract
In this paper we compare the verb synonym information contained in four publicly available lexical-semantic resources for Portuguese: TeP, PAPEL, Wiktionary and OpenThesaurusPT. We quantify the extent to which verb synonymy information in the four resources overlaps, and how much novel information each resource contributes in comparison to the others. We demonstrate that the four resources vary significantly with respect to verb synonymy information. We also show that by merging the four resources we can obtain a more comprehensive verb thesaurus. Finally, we suggest that resource merging may actually be required in order to avoid the performance and evaluation biases that arise from coverage problems when only one of these resources is used.
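As a rough illustration of the kind of comparison the abstract describes, the sketch below treats each resource as a set of unordered verb synonym pairs and computes pairwise overlap, per-resource novelty, and the merged thesaurus. The abstract does not specify the paper's actual measures, so Jaccard overlap and the novelty ratio used here are assumptions, the helper names (`pairs`, `jaccard`, `novelty`) are hypothetical, and the toy entries are invented rather than taken from the real resources.

```python
from itertools import combinations


def pairs(entries):
    """Turn {verb: [synonyms]} into a set of unordered synonym pairs."""
    return {
        frozenset((verb, syn))
        for verb, syns in entries.items()
        for syn in syns
        if syn != verb
    }


# Toy data, invented for illustration; NOT extracted from the actual resources.
resources = {
    "TeP": pairs({"começar": ["iniciar", "principiar"], "acabar": ["terminar"]}),
    "PAPEL": pairs({"começar": ["iniciar"], "acabar": ["findar"]}),
    "Wiktionary": pairs({"iniciar": ["começar"], "falar": ["dizer"]}),
    "OpenThesaurusPT": pairs({"acabar": ["terminar", "concluir"]}),
}


def jaccard(a, b):
    """Overlap between two pair sets (assumed measure, not the paper's)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def novelty(name, resources):
    """Fraction of a resource's pairs found in none of the other resources."""
    others = set().union(*(p for n, p in resources.items() if n != name))
    own = resources[name]
    return len(own - others) / len(own) if own else 0.0


# Merged thesaurus: the union of all synonym pairs.
merged = set().union(*resources.values())

if __name__ == "__main__":
    for a, b in combinations(resources, 2):
        print(f"overlap({a}, {b}) = {jaccard(resources[a], resources[b]):.2f}")
    for name in resources:
        print(f"novelty({name}) = {novelty(name, resources):.2f}")
    print(f"merged pairs: {len(merged)}")
```

Storing each pair as a `frozenset` makes synonymy symmetric, so "começar / iniciar" and "iniciar / começar" count as the same pair regardless of which verb a resource lists as the headword.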
Similar resources
Exploring the Vector Space Model for Finding Verb Synonyms in Portuguese
We explore the performance of the Vector Space Model (VSM) in finding verb synonyms in Portuguese by analyzing the impact of three operating parameters: (i) the weighting function, (ii) the context window used for automatically extracting features, and (iii) the minimum number of vector features. We rely on distributional statistics taken from a large n-gram database to build feature vectors, u...
Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua
In recent years many knowledge bases (KBs) have been constructed, yet there is not yet a verb resource that maps to these growing KB resources. A resource that maps verbs in different languages to KB relations would be useful for extracting facts from text into the KBs, and to aid alignment and integration of knowledge across different KBs and languages. Such a multi-lingual verb resource would...
Verb Clustering for Brazilian Portuguese
Levin-style classes, which capture the shared syntax and semantics of verbs, have proven useful for many Natural Language Processing (NLP) tasks and applications. However, lexical resources which provide information about such classes are only available for a handful of the world's languages. Because manual development of such resources is extremely time consuming and cannot reliably capture domain va...
Comparing and combining semantic verb classifications
In this article, we address the task of comparing and combining different semantic verb classifications within one language. We present a methodology for the manual analysis of individual resources on the level of semantic features. The resulting representations can be aligned across resources, and allow a contrastive analysis of these resources. In a case study on the Manner of Motion domain a...
Finding High-Frequent Synonyms of A Domain-Specific Verb in English Sub-Language of MEDLINE Abstracts Using WordNet
The task of binary relation extraction in IE [3] is based mainly on high-frequent verbs and patterns. During the extraction of a specific relation from MEDLINE English abstracts, it is noticed that besides the high-frequent verb itself which represents the specific relation, some other word forms, such as the nominal and adjective forms of this verb, as well as its synonyms, also play a very im...